Insertions and deletions of nucleotides in the genes
encoding the variable domains of antibodies are natural 
components of the hypermutation process, which may 
expand the available repertoire of hypervariable loop 
lengths and conformations. Although insertion of amino 
acids has also been utilized in antibody engineering, 
little is known about the functional consequences of 
such modifications. To investigate this further, we have 
introduced single-codon insertions and deletions as well 
as more complex modifications in the complementarity- 
determining regions of human antibody fragments with 
different specificities. Our results demonstrate that 
single amino acid insertions and deletions



combination of different heavy and light domains (4). The di- 
versity is further increased by the process of somatic hypermu- 
tation (5) and by receptor editing and revision (6). As the 
germline variable gene repertoire encodes a rather limited 
number of CDR loop lengths (IMGT, the international ImMunoGeneTics
data base, Ref. 7), the number of observed canon-
discovered that B cells evolve the genes encoding immunoglob- 
ulin V domains not only by nucleotide substitution but also 
through an additional mechanism of insertion and deletion of 
able (V) 1 domains, the heavy (H) and the light (L), which both respond to hapten, peptide, and protein, respectively. This re-
s. The antigen specificity of the the loop lengths of an antibody-binding site, it may thus be
possible to design antibodies optimally suited for recognition of 
a particular class of antigen. Lamminmaki et al. (17) have in 
approach to modify a murine antibody specific for 
They introduced additional residues into CDR2 
ideations. We have therefore created single-codon insertions
and deletions as well as more complex modifications in the 
CDR of two human antibody single chain V region fragments 
(scFv) specific for a peptide and a hapten, respectively, and 
investigated the effects on antigen recognition, thermal stabil- 
ity, and protein folding. Our results demonstrate that single 
amino acid insertions in both CDRH1 and H2 and deletions in 
CDRH2 are usually well tolerated and permit production of 
folded proteins despite the fact that the modified loops carry 
amino acids that are disallowed at key residue positions in 
canonical loops of the corresponding length or do not take on a 
characteristic length of a known canonical structure. Modifica- 
tions of this kind are in other words an efficient mode of 
expanding antibody sequence and structure space beyond what 
is encoded by the germline gene repertoire, which may enable 
targeting of novel or otherwise poorly immunogenic antigens. 



Antibody Frameworks— The frameworks encoding the anti-cytomegalovirus
scFv AE11F and the anti-fluorescein isothiocyanate (FITC)
scFv FITC8 have been described elsewhere (18-20). The cloning and
production of the AE11F and AE11F/3-20L1 scFv in Pichia pastoris
—The parent antibody frameworks
used in this study are both of human origin although there are
differences in the way they were obtained. The AE11F scFv
was derived from a monoclonal antibody isolated from a cytomegalovirus-seropositive
blood donor (18, 19). It originates
from the IGHV3-30 and IGKV3-11 genes, which both have
acquired a number of mutations (21). This scFv recognizes both
intact glycoprotein B from cytomegalovirus and peptides mimicking
the AD-2 epitope (21, 28). The hapten (FITC)-specific
scFv FITC8 was derived from a synthetic scFv library, which
had been constructed by shuffling of human CDR sequences
into a single framework consisting of the human IGHV3-23
and IGLV1-47 genes (20). The CDR sequences utilized by this
scFv originate from IGHV3-7 and IGHV3-23 in the case of
CDRH1 and CDRH2, IGLV1-40 and IGLV1-40 or IGLV1-50
in the case of CDRL1 and CDRL2, and IGLV1-47 in the case of
CDRL3. Except for the CDRL1 loop, which is one residue longer
than the IGLV1-47 germline length, the CDR loops of the
FITC8 scFv are of the same length as the loops normally 
encoded by the framework genes. As the structures of the 
two scFv have not been determined, the loop structures are 
unknown. However, by analyzing the deduced amino acid 
sequences using the tools at the Antibodies - Structure and 
Sequence server (27), the most similar of the observed canon- 
ical classes were identified (Table I). 

Single-codon Insertions and Deletions— To determine the 
capability of the two antibody frameworks to tolerate length 
modifications in the CDR loops, we made single-codon inser- 
tions in CDRH1 and CDRH2 and a single-codon deletion in 
CDRH2. The modifications involved insertions after positions 
31-33 in CDRH1, insertions after positions 57 and 58 in 
CDRH2, and a deletion at position 58 in CDRH2 (Fig. 1). All 
Examples of various single-codon modifications in scFv clones based on the
belonging of the CDR loops, reactivity of the scFv with the original antigens
and the unfolding temperature of selected clone
Modification refers to the nature of the changes in loop length; Ins indicates insert
IMGT unique numbering (7). Canonical class indicates the combination of canonical
canonical structure classification (27). The altered canonical structure is indicated
positive and negative clones from each
library were sequenced to determine the nature of the modifi- 
cations, and the analysis showed that a wide range of amino 
acids was inserted at the intended positions. To determine the 
effect of these length modifications on the structure of the 

identified by the automatic canonical structure classification 
(27). A number of examples from each insertion library and the 
deletion variants are presented in Table I. 
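
As a purely illustrative sketch, and not part of the experimental procedures, the kind of length modification described above (adding or removing one codon while keeping the reading frame) can be written out in a few lines of Python. The function names and the toy nucleotide string are hypothetical, and positions are plain zero-based codon indices rather than IMGT numbers.

```python
def split_codons(dna):
    """Split an in-frame nucleotide string into a list of codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def insert_codon(dna, after_index, codon):
    """Return dna with one codon inserted after the codon at after_index."""
    if len(codon) != 3:
        raise ValueError("a single-codon insertion needs exactly three bases")
    codons = split_codons(dna)
    if not 0 <= after_index < len(codons):
        raise IndexError("codon index out of range")
    codons.insert(after_index + 1, codon)
    return "".join(codons)


def delete_codon(dna, index):
    """Return dna with the codon at index removed."""
    codons = split_codons(dna)
    del codons[index]
    return "".join(codons)


# Hypothetical three-codon stretch, used only to show the calls.
toy_loop = "AAACCCGGG"
longer = insert_codon(toy_loop, 0, "TTT")   # one codon longer, still in frame
shorter = delete_codon(toy_loop, 1)         # one codon shorter, still in frame
print(longer, shorter)
```

Because both operations act on whole codons, every variant stays in frame, which is the property that lets a length-modified loop be expressed as a full-length scFv at all.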

As the AE11F-based libraries were only tested for the production
of FLAG-tagged proteins, they had to be characterized
further to determine whether the scFv were functionally
folded. This was done by analyzing the antigen-binding properties
of the modified clones. Although changes in loop structure
may be associated with a loss of antigen recognition,
specific recognition of an antigen will confirm that the polypeptide
chain is correctly folded as this is a requirement for it to
function as a framework for the antigen-binding site. Analysis
of expression supernatants of randomly picked clones (including
the deletion variants) by ELISA or by using the BIAcore
Importantly, this analysis showed that most of the AE11F-based
clones had also retained their specificity for the original viral

number of irrelevant antigens (see "Experimental Proce- 
dures"), none of the clones displayed any cross-reactivity (data 
not shown), demonstrating that the modified scFv clones re- 
tained a high degree of specificity for the original antigens and 
therefore likely also assumed a correct immunoglobulin fold. 

A number of clones of each specificity, chosen to exemplify 
the different modifications, were produced at a large scale to 
measurements with the purified monomers of the ASV07, ASV10,
ASV35, FSV43, FSV61, and FSV84 clones confirmed the pre-

(Table I and Fig. 2). Furthermore, evaluation of the reaction

modifications did not affect the dissociation rates of the FITC8-based
clones to any greater extent (Fig. 2B). The thermal
stability of the purified monomers was determined by DSC, and
all tested clones displayed unfolding temperatures very similar
to the parent scFv (Table I), further verifying that the IGHV3-derived
antibody frameworks tolerate single-codon ir
CDRH1 and H2 very well.




Fig. 2. The single-codon modifications did not affect the reaction rate kinetics of the FITC8-based scFv variants to any greater
extent. Representative BIAcore sensorgrams of AE11F-based scFv and FITC8-based scFv analyzed on streptavidin-bound viral peptide (A) or
streptavidin-bound FITC (B), respectively. Dissociation rate constants (k) were calculated from multiple measurements and are presented as the



As insertions and deletions have been demonstrated to occur 
naturally in both heavy and light domain V genes (8), we 
decided to extend this study and also evaluate the stability of a 
previously produced AE11F-based scFv variant with an insertion
in CDRL1 (AE11F/3-20L1) (21). The modified CDRL1 of

the germline gene from which AE11F originates. This clone has
also been demonstrated to recognize both the epitope-mimicking
peptide and intact, recombinant glycoprotein B, albeit
with a lower affinity than the affinity matured AE11F scFv (21,
32). The thermal stability of the AE11F/3-20L1 scFv was determined
as before after purification of monomeric scFv, and the
unfolding temperature was found to be similar to that of the



original scFv (Table I), thus indicating that not only heavy but 
also light domain CDR tolerate modifications of this nature well. 

Grafting of CDRH1 Loops from Distantly Related IGHV 
Genes— As all of the insertions and deletions described so far
were introduced at the tips of the hypervariable loops, the parts
of the immunoglobulin fold that best can be expected to accommodate
such modifications, we decided to introduce more extensive
modifications to investigate the effect of such changes
of antibody sequence and structure. These modifications were
introduced into and immediately adjacent to CDRH1 of the
AE11F framework by the CDR-shuffling technique (22) using
CDR sequences isolated from activated human B cells. Se- 
quences originating from the IGHV4 subgroup were chosen for 
Table II

Deduced amino acid sequences, germline gene origin, and canonical structure class belonging of the CDRH1 loops of the AE11F scFv and the
CDRH1 grafted variants of this

Amino acid sequences are aligned and numbered in accordance with the IMGT unique numbering (7) and gaps thereby introduced are indicated
by dashes. Amino acids that are part of the CDR1-IMGT (7) are underlined. Dots indicate identity with the AE11F sequence. Canonical structures
were determined by automatic canonical structure classification (27).
the grafting as these are only distantly related to the IGHV3
CDR and therefore allow for a higher degree of variability. In
addition, genes from the IGHV4 subgroup encode loops of different
lengths than genes from the IGHV3 subgroup, including
loops of the same length as the ones created by the single codon
insertions in CDRH1, thus enabling a comparison with these
modifications. Sequencing of randomly picked clones showed
that seemingly functional, i.e. in-frame and without stop
codons, IGHV3 genes carrying IGHV4-derived CDRH1 sequences
were obtained (Table II). However, when analyzing
crude expression supernatants of the constructs, it was found
that all of the clones had lost the original antigen specificity
and instead acquired a polyreactive character (Fig. 3).
To further investigate this polyreactive nature of the 

presence of loop lengths different from the one used by the 
parent antibody (Table II). As judged by analytical gel filtra- 
in Fig. 4, the spectra of both of the CDRH1-grafted
displayed a strong negative signal near 200 nm, which is indicative
of unordered polypeptides (33). For a comparison, the
spectra of both the parent scFv and the FITC8 scFv displayed
a weak negative signal near 217 nm, which is characteristic of
the β-sheet conformation of antibody domains (Fig. 4). The

codon modifications, such as the AE11F/3-20L1 and the FSV43,
which gave rise to nearly identical spectra as the parent scFv
(data not shown). When analyzed by DSC, no unfolding temperatures
could be determined for either of the E3 or E6 scFv,
suggesting that the proteins already were in an, at least partly,
unfolded state. Thus, by inserting these only distantly related
CDR sequences into the IGHV3 framework, the boundaries
that define a stable immunoglobulin fold had apparently been


of both insertions

well as more extensive modifications in the CDR of two antibody
fragments with different specificities and assessed the
thermal stability and the antigen binding properties of the 
resulting proteins. 

The single-codon modifications were well tolerated by the
two scFv frameworks as determined by the thermal stability
measurements and the high ratio of functional clones despite
the fact that they created both loop lengths that do not occur
normally within the human IGHV3 subgroup and combina- 
tions of loop lengths that do not exist in the human germline 
repertoire. Insertion of one residue in CDRH2 of the two scFv 
studied here creates a loop length (CDR2-IMGT length 9 amino 
acids) that is not naturally encoded by any IGHV genes except 
for the only member of the IGHV6 subgroup (7). This loop 

mnoglobulin 
residue in CDRH1 produces a loop length (CDR1-IMGT length
9 amino acids) that occurs naturally within the human IGHV4,
but not the IGHV3 subgroup, and which could correspond to
canonical structure 2 as judged by the automatic canonical
structure classification. This coexistence of canonical structure
2 in CDRH1 with canonical structure 3 in CDRH2 (Table I)
does not occur naturally within the human IGHV germline
repertoire, although it has been observed in hypermutated
antibodies with insertions in CDRH1 (8). In addition, the structure
classification also revealed that a large number of the key
residue requirements for canonical structure 2 were not fulfilled
(27), i.e. the thus modified CDRH1 loops either take on
structures not covered by the described canonical structures or
adopt the observed structure corresponding to this loop length
despite the presence of a large number of disallowed amino
acids at key residue positions. Irrespective of the circumstances,
the insertions in CDRH1 seem to, like the rest of the
single-codon modifications, give rise to scFv that are correctly
folded and stable.
lin V region genes are evolved (8-H) and which may expand
the available repertoire of antibody hypervariable loop lengths 
and structures. Although sequence modifications of this kind, 
especially insertions, have also been exploited in antibody en- 
gineering, knowledge about the effects of these modifications 
on protein stability and antigen recognition is still limited. 
Such factors are critical as they determine the success of this 
mode of molecular evolution, whether employed by nature or by 
the context of an IGHV3 framework. Apparently func
antibodies belonging to the IGHV3 subgroup with in
CDRH1 and CDRH2 leading to CDR-IMGT loop lengths of 9
amino acids have in fact been described by others (8, 34, 35). As 
the deletions at position 58 in CDRH2 of both scFv give rise to 
loop lengths that are used by other members of the IGHV3 
subgroup, it is not entirely unexpected that these modifications 
are tolerated by the scFv frameworks studied here. Furthermore,
in a previous study, we have found that single-codon
deletions, some of which have also been shown to be functional,
occur in antibodies belonging to the IGHV3 subgroup at or
immediately adjacent to position 58 (12). The single-codon 
modifications of antibody sequence space we have presented 
here are in other words highly representative of changes that 
may occur naturally as a consequence of the somatic hypermu- 



cided to investigate the possibility of using CDRH1 sequences
originating from this subgroup to diversify the AE11F scFv.
This approach resembles evolution through receptor revision,
which occurs in vivo (36, 37) and has also been shown to
provide a selection advantage in vitro (38). However, grafting of
CDRH1 loops of different lengths from the IGHV4 subgroup
into the IGHV3 framework used by the AE11F scFv resulted
not only in a loss of the original antigen specificity but also in 
the acquisition of a polyreactive character, even when not hav- 
ing been put through a potentially denaturing purification 
process (39), by the thus modified scFv clones (Fig. 3). This 



polyreactivity is most likely due to a destabilized or inappro- 
priately folded V domain, as demonstrated by the CD spectra of 
two of the clones (Fig. 4). Destabilizing effects of loop grafting 
into an antibody framework have been reported previously 
(40), but in that particular case, the grafted sequences were 
totally unrelated to antibody hypervariable loops. The use of 
naturally occurring CDR sequences for grafting into immuno- 
globulin frameworks often ensures that the inserted loops are 
optimally functional as they have been proofread and selected 
for functionality during the formation of the B cell receptors.
Our data show, however, that the functionality
loops also depends on the framework they ar 
even if they are natural immunoglobulin sequences. The reason 
for the observed effects probably lies in the differences in certain
key residues between the IGHV3 and IGHV4 frameworks.
In fact, many of the amino acids that differ between the original
AE11F sequence and the grafted sequences are residues
that are used to define the canonical structures (27, 31). In
addition, Tramontano et al. (41) have shown that framework
residue 80 of the heavy V domain packs against residues in
both CDRH1 (position 30) and CDRH2 (position 58) and that it
is an important determinant of the conformation of the CDRH2
loop. A subsequent mutational study has also shown that the
nature of this residue determines the binding characteristics of
an antibody by influencing the conformation of the heavy chain
CDR loops (42). The AE11F framework has, like all unmutated
antibodies belonging to the IGHV3 subgroup, an Arg at position
80, whereas all genes belonging to the IGHV4 subgroup,
from which the CDRH1 sequences were obtained, encode a Val
residue at this position in their germline configurations. The
larger, charged Arg possibly causes clashes with the IGHV4-derived
residues in and adjacent to CDRH1, which leads to an
improper fold and poor stability of the resulting scFv product.

In conclusion, we demonstrate here that single amino acid 
insertions in both CDRH1 and H2 and deletions in CDRH2,
which are highly representative of modifications that occur 
naturally in regions of the hypervariable loops known to be 

receptors, are well tolerated and permit production of stably 
folded proteins. This is true despite the fact that the thus 
modified loops do not fulfill the key residue requirements for 
canonical loops of the corresponding length or are of a length 
not associated with a known canonical structure (27). This 
demonstrates the plasticity of antibody V domain frameworks 
belonging to the important IGHV3 subgroup, which makes up 
a large fraction of all human antibodies (43), and its capacity to 
tolerate modifications that expand sequence and structure 
space beyond the limits set by the germline-encoded diversity. 



